INTELLIGENT PERSONAL ASSISTANTS

TECHNICAL FIELD 

This description relates to techniques for developing and using a computer 
interface agent to assist a computer system user. 

BACKGROUND 

A computer system may be used to accomplish many tasks. A user of a computer
system may be assisted by a computer interface agent that provides information to the 
user or performs a service for the user. 

SUMMARY 

In one general aspect, implementing an intelligent personal assistant includes
receiving an input associated with a user and an input associated with an application 
program, and accessing a user profile associated with the user. Context information is 
extracted from the received input, and the context information and the user profile are 
processed to produce an adaptive response by the intelligent personal assistant. 

Implementations may include one or more of the following features. For example,
the application program may be a personal information management application program, 
an application program to operate a computing device, an entertainment application 
program, or a game. 

An adaptive response by the intelligent personal assistant may be associated with 
a personal information management application program, an application program to
operate a computing device, an entertainment application program, or a game. 

In another general aspect, an apparatus for implementing an intelligent social 
agent includes an information extractor, an adaptation engine, and an output generator. 
The information extractor is configured to access a user profile associated with the user, 
receive an input associated with a user, and extract context information from the received
input. The adaptation engine is configured to receive the context information and the user 
profile from the information extractor and process the context information and the user
profile to produce an adaptive output. The output generator is configured to receive the
adaptive output and represent the adaptive output in the intelligent social agent. 

Implementations may include one or more of the features noted above and one or 
more of the following features. For example, the information extractor may be 
configured to receive physiological data or application program information associated
with the user. The information extractor may be configured to extract information about 
an affective state of the user from physiological information associated with the user, 
vocal analysis information associated with the user by extracting verbal content and 
analyzing speech characteristics of the user, or verbal information from the user. 
Extracting context information also may include extracting a geographical position of the
user and extracting information based on the geographical position of the user by using a 
global positioning system. Extracting context information may include extracting 
information about the application context associated with the user or about a linguistic 
style of the user. 

An output generator may be a verbal generator, the adaptation engine may be
configured to produce a verbal expression, and the verbal generator may produce the 
verbal expression in the intelligent social agent. An output generator may be an affect 
generator, the adaptation engine may be configured to produce a facial expression, and 
the affect generator may produce the facial expression in the intelligent social agent. The
output generator may be a multi-modal generator that represents an adaptive output in the
intelligent social agent using at least one of two modes. One mode may be a verbal mode 
and another mode may be an affect mode. The adaptation engine may be configured to
produce a facial expression and a verbal expression that are represented in the intelligent
social agent by the multi-modal output generator. The adaptation engine may be
configured to produce an emotional expression in the intelligent social agent. The output
generator may be configured to represent the emotional expression in the intelligent 
social agent. 

In yet another general aspect, implementing an intelligent social agent includes 
receiving an input associated with a user, accessing a user profile associated with the user, 
extracting context information from the received input, and processing the context
information and the user profile to produce an adaptive output to be represented by the 
intelligent social agent.

Implementations may include one or more of the features noted above or one or
more of the following features. For example, the input associated with the user may 
include physiological data or application program information associated with the user. 
Extracting context information may include extracting information about an affective 
state of the user from physiological information, vocal analysis information, or verbal
information associated with a user. Extracting context information also may include 
extracting a geographical position of the user and extracting information based on the 
geographical position of the user. Extracting context information may include extracting 
information about the application context associated with the user or about a linguistic 
style of the user. An adaptive output to be represented by the intelligent social agent may
be a verbal expression, a facial expression, or an emotional expression. 

Implementations of any of the techniques described above may include a method 
or process, a computer program on computer-readable media, a system or an apparatus, or 
a mobile device for implementing an intelligent social agent that interacts with a user or 
other type of system.

The details of one or more of the implementations are set forth in the 
accompanying drawings and description below. Other features and advantages will be 
apparent from the descriptions and drawings, and from the claims. 



DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a programmable system for developing and using an
intelligent social agent. 

FIG. 2 is a block diagram of a computing device on which an intelligent social 
agent operates. 

FIG. 3 is a block diagram illustrating an architecture of a social intelligence
engine.

FIGS. 4A and 4B are flow charts of processes for extracting affective and 
physiological states of the user. 

FIG. 5 is a flow chart of a process for adapting an intelligent social agent to the 
user and the context. 

FIG. 6 is a flow chart of a process for casting an intelligent social agent.

FIGS. 7-10 are block diagrams showing various aspects of an architecture of an
intelligent personal assistant. 

Like reference symbols in the various drawings indicate like elements. 



DETAILED DESCRIPTION 

Referring to FIG. 1, a programmable system 100 for developing and using an
intelligent social agent includes a variety of input/output (I/O) devices (e.g., a mouse 102, 
a keyboard 103, a display 104, a voice recognition and speech synthesis device 105, a 
video camera 106, a touch input device with stylus 107, a personal digital assistant or
"PDA" 108, and a mobile phone 109) operable to communicate with a computer 110
having a central processor unit (CPU) 120, an I/O unit 130, a memory 140, and a data
storage device 150. Data storage device 150 may store machine-executable instructions, 
data (such as configuration data or other types of application program data), and various 
programs such as an operating system 152 and one or more application programs 154 for 
developing and using an intelligent social agent, all of which may be processed by CPU
120. Each computer program may be implemented in a high-level procedural or
object-oriented programming language, or in assembly or machine language if desired; and in
any case, the language may be a compiled or interpreted language. Data storage device 
150 may be any form of non-volatile memory, including by way of example 
semiconductor memory devices, such as Erasable Programmable Read-Only Memory
(EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and
flash memory devices; magnetic disks such as internal hard disks and removable disks; 
magneto-optical disks; and Compact Disc Read-Only Memory (CD-ROM). 

System 100 also may include a communications card or device 160 (e.g., a 
modem and/or a network adapter) for exchanging data with a network 170 using a
communications link 175 (e.g., a telephone line, a wireless network link, a wired network
link, or a cable network). Alternatively, a universal serial bus (USB) connector may be
used to connect system 100 for exchanging data with a network 170. Other examples of 
system 100 may include a handheld device, a workstation, a server, a device, or some 
combination of these capable of responding to and executing instructions in a defined
manner. Any of the foregoing may be supplemented by, or incorporated in, ASICs
(application-specific integrated circuits).

Although FIG. 1 illustrates a PDA and a mobile phone as being peripheral with
respect to system 100, in some implementations, the functionality of the system 100 may 
be directly integrated into the PDA or mobile phone. 

FIG. 2 shows an exemplary implementation of intelligent social agent 200 for a 
computing device including a PDA 210, a stylus 212, and a visual representation of an
intelligent social agent 220. Although FIG. 2 shows an intelligent social agent as an
animated talking head style character, an intelligent social agent is not limited to such an 
appearance and may be represented as, for example, a cartoon head, an animal, an image 
captured from a video or still image, a graphical object, or as a voice only. The user may
select the parameters that define the appearance of the social agent. The PDA may be, for
example, an iPAQ™ Pocket PC available from COMPAQ. 

An intelligent social agent 200 is an animated computer interface agent with social 
intelligence that has been developed for a given application or device or a target user 
population. The social intelligence of the agent comes from the ability of the agent to be
appealing, affective, adaptive, and appropriate when interacting with the user. Creating
the visual appearance, voice, and personality of an intelligent social agent that is based on 
the personal and professional characteristics of the target user population may help the 
intelligent social agent be appealing to the target users. Programming an intelligent social 
agent to manifest affect through facial, vocal, and linguistic expressions may help the
intelligent social agent appear affective to the target users. Programming an intelligent
social agent to modify its behavior for the user, application, and current context may help 
the intelligent social agent be adaptive and appropriate to the target users. The interaction 
between the intelligent social agent and the user may result in an improved experience for 
the user as the agent assists the user in operating a computing device or computing device
application program.

FIG. 3 illustrates an architecture of a social intelligence engine 300 that may 
enable an intelligent social agent to be appealing, affective, adaptive, and appropriate 
when interacting with a user. The social intelligence engine 300 receives information 
from and about the user 305 that may include a user profile, and from and about the
application program 310. The social intelligence engine 300 produces behaviors and
verbal and nonverbal expressions for an intelligent social agent. 

The user may interact with the social intelligence engine 300 by speaking, 
entering text, using a pointing device, or using other types of I/O devices (such as a touch
screen or vision tracking device). Text or speech may be processed by a natural language
processing system and received by the social intelligence engine as a text input. Speech 
will be recognized by speech recognition software and may be processed by a vocal 
feature analyzer that provides a profile of the affective and physiological states of the user 
based on characteristics of the user's speech, such as pitch range and breathiness.

Information about the user may be received by the social intelligence engine 300. 
The social intelligence engine 300 may receive personal characteristics (such as name, 
age, gender, ethnicity or national origin information, and preferred language) about the 
user, and professional characteristics about the user (such as occupation, position of
employment, and one or more affiliated organizations). The user information received
may include a user profile or may be used by the central processor unit 120 to generate 
and store a user profile. 

Non-verbal information received from a vocal feature analyzer or natural language 
processing system may include vocal cues from the user (such as fundamental pitch and
speech rate). A video camera or a vision tracking device may provide non-verbal data
about the user's eye focus, head orientation, and other body position information. A 
physical connection between the user and an I/O device (such as a keyboard, a mouse, a 
handheld device, or a touch pad) may provide physiological information (such as a 
measurement of the user's heart rate, blood pressure, respiration, temperature, and skin
conductivity). A global positioning system may provide information about the user's
geographic location. Other such contextual awareness tools may provide additional 
information about a user's environment, such as a video camera that provides one or more 
images of the physical location of the user that may be processed for contextual 
information, such as whether the user is alone or in a group, inside a building in an office
setting, or outside in a park.

The social intelligence engine 300 also may receive information from and about 
an application program 310 running on the computer 110. The information from the 
application program 310 is received by the information extractor 320 of the social 
intelligence engine 300. The information extractor 320 includes a verbal extractor 322, a
non-verbal extractor 324, and a user context extractor 326.

The verbal extractor 322 processes verbal data entered by the user. The verbal 
extractor may receive data from the I/O device used by the user or may receive data after 
processing (such as text generated by a natural language processing system from the
original input of the user). The verbal extractor 322 captures verbal content, such as
commands or data entered by the user for a computing device or an application program 
(such as those associated with the computer 110). The verbal extractor 322 also parses 
the verbal content to determine the linguistic style of the user, such as word choice, 
grammar choice, and syntax style.

The verbal extractor 322 captures verbal content of an application program, 
including functions and data. For example, functions in an email application program 
may include viewing an email message, writing an email message, and deleting an email 
message, and data in an email message may include the words included in a subject line,
identification of the sender, time that the message was sent, and words in the email
message body. An electronic commerce application program may include functions such
as searching for a particular product, creating an order, and checking a product price and 
data such as product names, product descriptions, product prices, and orders. 

The nonverbal extractor 324 processes information about the physiological and
affective states of the user. The nonverbal extractor 324 determines the physiological and
affective states of the user from 1) physiological data, such as heart rate, blood pressure,
blood pulse volume, respiration, temperature, and skin conductivity; 2) voice
feature data, such as speech rate and amplitude; and 3) the user's verbal content that
reveals affective information, such as "I am so happy" or "I am tired". Physiological data
provide rich cues from which to infer a user's emotional state. For example, an accelerated heart
rate may be associated with fear or anger and a slow heart rate may indicate a relaxed 
state. Physiological data may be determined using a device that attaches from the 
computer 110 to a user's finger and is capable of detecting the heart rate, respiration rate, 
and blood pressure of the user. The nonverbal extraction process is described with respect to FIGS. 4A and 4B.

The user context extractor 326 determines the internal context and external
context of the user. The user context extractor 326 determines the mode in which the user 
requests or executes an action (which may be referred to as internal context) based on the 
user's physiological data and verbal data. For example, the command to show sales 
figures for a particular period of time may indicate an internal context of urgency when
the words are spoken with a faster speech rate, less articulation, and faster heart rate than
when the same words are spoken with a normal style for the user. The user context 
extractor 326 may determine an urgent internal context from the verbal content of the 
command, such as when the command includes the term "quickly" or "now". 
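
By way of illustration only, the urgency determination described above may be sketched as follows. The function name, the 15% margin over the user's typical values, and the rule that both nonverbal cues must be present are assumptions made for this sketch and are not specified in this description.

```python
# Terms named in the description as indicating urgency in the verbal content.
URGENT_TERMS = {"quickly", "now"}


def is_urgent(command, speech_rate, heart_rate,
              typical_speech_rate, typical_heart_rate, ratio=1.15):
    """Return True when a command appears to carry an urgent internal context.

    `ratio` (15% above the user's norm) is an illustrative assumption.
    """
    words = {w.strip(".,!?").lower() for w in command.split()}
    if words & URGENT_TERMS:
        return True  # urgency is stated in the verbal content itself
    # Otherwise compare the nonverbal cues with the user's normal style.
    fast_speech = speech_rate > typical_speech_rate * ratio
    fast_heart = heart_rate > typical_heart_rate * ratio
    return fast_speech and fast_heart
```

In this sketch the same command is classified differently depending on how it is spoken, which mirrors the sales-figures example above.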

The user context extractor 326 determines the characteristics of the user's
environment (which may be referred to as the external context of the user). For example, 
a global positioning system (integrated within or connected to the computer 110) may 
determine the geographic location of the user from which the user's local weather 
conditions, geology, culture, and language may be determined. The noise level in the
user's environment may be determined, for instance, through a natural language 
processing system or vocal feature analyzer stored on the computer 110 that processes 
audio data detected through a microphone integrated within or connected to the computer 
110. By analyzing images from a video camera or vision tracking device, the user context
extractor 326 may be able to determine other physical and social environment
characteristics, such as whether the user is alone or with others, located in an office 
setting, or in a park or automobile. 

The application context extractor 328 determines information about the 
application program context. This information may, for example, include the importance
of an application program, the urgency associated with a particular action, the level of
consequence of a particular action, the level of confidentiality of the application or the 
data used in the application program, frequency that the user interacts with the application 
program or a function in the application program, the level of complexity of the 
application program, whether the application program is for personal use or in an
employment setting, whether the application program is used for entertainment, and the
level of computing device resources required by the application program. 

The information extractor 320 sends the information captured and compiled by the 
verbal extractor 322, the non-verbal extractor 324, the user context extractor 326, and the 
application context extractor 328 to the adaptation engine 330. The adaptation engine
330 includes a machine learning module 332, an agent personalization module 334, and a
dynamic adaptor module 336. 

The machine learning module 332 receives information from the information 
extractor 320 and also receives personal and professional information about the user. The 
machine learning module 332 determines a basic profile of the user that includes
information about the verbal and non-verbal styles of the user, application program usage
patterns, and the internal and external context of the user. For example, a basic profile of 
a user may include that the user typically starts an email application program, a portal, 
and a list of items to be accomplished from a personal information management system
after the computing device is activated, the user typically speaks with correct
grammar and accurate wording, the internal context of the user is typically hurried, and 
the external context of the user has a particular level of noise and number of people. The 
machine learning module 332 modifies the basic profile of the user during interactions 
between the user and the intelligent social agent.

The machine learning module 332 compares the received information about the 
user and application content and context with the basic profile of the user. The machine 
learning module 332 may make the comparison using decision logic stored on the 
computer 110. For example, when the machine learning module 332 has received
information that the heart rate of the user is 90 beats per minute, the machine learning
module 332 compares the received heart rate with the typical heart rate from the basic 
profile of the user to determine the difference between the typical and received heart 
rates, and if the heart rate is elevated a certain number of beats per minute or a certain 
percentage, the machine learning module 332 determines the heart rate of the user is
significantly elevated and a corresponding emotional state is evident in the user.
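
As a minimal sketch of the comparison just described, the decision logic may compute both the absolute and the percentage difference between the received and typical heart rates. The function name and the threshold values (15 beats per minute, 20 percent) are hypothetical; the description states only that "a certain number of beats per minute or a certain percentage" is used.

```python
def heart_rate_elevation(current_bpm, typical_bpm,
                         min_delta_bpm=15.0, min_delta_pct=20.0):
    """Compare a received heart rate with the typical rate from the basic profile.

    Returns (delta_bpm, delta_pct, significantly_elevated). Both thresholds
    are illustrative assumptions.
    """
    delta_bpm = current_bpm - typical_bpm
    delta_pct = 100.0 * delta_bpm / typical_bpm
    # Elevated when either the absolute or the relative threshold is met.
    elevated = delta_bpm >= min_delta_bpm or delta_pct >= min_delta_pct
    return delta_bpm, delta_pct, elevated
```

For a received rate of 90 beats per minute and an assumed typical rate of 70, the difference is 20 beats per minute, which exceeds the assumed threshold, so the rate would be treated as significantly elevated.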

The machine learning module 332 produces a dynamic digest about the user, the 
application, the context, and the input received from the user. The dynamic digest may 
list the inputs received by the machine learning module 332, any intermediate values 
processed (such as the difference between the typical heart rate and current heart rate of
the user), and any determinations made (such as the user is angry based on an elevated
heart rate and speech change or semantics indicating anger). The machine learning 
module 332 uses the dynamic digest to update the basic profile of the user. For example, 
if the dynamic digest indicates that the user has an elevated heart rate, the machine 
learning module 332 may so indicate in the current physiological profile section of the
user's basic profile. The agent personalization module 334 and the dynamic adaptor
module 336 may also use the dynamic digest. 
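
One possible in-memory representation of the dynamic digest is sketched below. The class name, field names, and the `current_state` profile key are assumptions chosen for illustration; the description does not prescribe a data format.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicDigest:
    """Hypothetical container for the three kinds of digest entries."""
    inputs: dict = field(default_factory=dict)          # inputs received
    intermediate: dict = field(default_factory=dict)    # intermediate values
    determinations: list = field(default_factory=list)  # conclusions reached


def update_profile(profile, digest):
    """Return a copy of the basic profile with its current-state section
    refreshed from the digest's determinations."""
    updated = dict(profile)
    updated["current_state"] = list(digest.determinations)
    return updated
```

Returning a copy, rather than modifying the profile in place, is a design choice of this sketch that lets the other modules read the previous profile while the update is prepared.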

The agent personalization module 334 receives the basic profile of the user and 
the dynamic digest about the user from the machine learning module 332. Alternatively, 
the agent personalization module 334 may access the basic profile of the user or the
dynamic digest about the user from the data storage device 150. The agent
personalization module 334 creates a visual appearance and voice for an intelligent social
agent (which may be referred to as casting the intelligent social agent) that may be 
appealing and appropriate for a particular user population and adapts the intelligent social
agent to fit the user and the user's changing circumstances as the intelligent social agent
interacts with the user (which may be referred to as personalizing the intelligent social
agent).

WO 03/073417 PCT/US03/06218

The dynamic adaptor module 336 receives the adjusted basic profile of the user 
and the dynamic digest about the user from the machine learning module 332 and
information received or compiled by the information extractor 320. The dynamic adaptor 
module 336 also receives casting and personalization information about the intelligent 
social agent from the agent personalization module 334. 

The dynamic adaptor module 336 determines the actions and behavior of the
intelligent social agent. The dynamic adaptor module 336 may use verbal input from the
user and the application program context to determine the one or more actions that the 
intelligent social agent should perform. For example, when the user enters a request to 
"check my email messages" and the email application program is not activated, the 
intelligent social agent activates the email application program and initiates the email
application function to check email messages. The dynamic adaptor module 336 may use
nonverbal information about the user and contextual information about the user and the 
application program to help ensure that the behaviors and actions of the intelligent social 
agent are appropriate for the context of the user. 

For example, when the machine learning module 332 indicates that the user's
internal context is urgent, the dynamic adaptor module 336 may adjust the intelligent
social agent so that the agent has a facial expression that looks serious and stops or pauses
a non-critical function (such as receiving a large data file from a network) or closes
unnecessary application programs (such as a drawing program) to accomplish a requested
urgent action as quickly as possible.

When the machine learning module 332 indicates that the user is fatigued, the
dynamic adaptor module 336 may adjust the intelligent social agent so that the agent has
a relaxed facial expression, speaks more slowly, and uses words with fewer syllables and
sentences with fewer words.

When the machine learning module 332 indicates that the user is happy or
energetic, the dynamic adaptor module 336 may adjust the intelligent social agent to have
a happy facial expression and speak faster. The dynamic adaptor module 336 may have
the intelligent social agent suggest additional purchases or upgrades when the user is
placing an order using an electronic commerce application program. 

When the machine learning module 332 indicates that the user is frustrated, the 
dynamic adaptor module 336 may adjust the intelligent social agent to have a concerned 
facial expression and make fewer or only critical suggestions. If the machine learning
module 332 indicates that the user is frustrated with the intelligent social agent, the
dynamic adaptor module 336 may have the intelligent social agent apologize and explain
sensibly what the problem is and how it should be fixed.
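
The adaptations described above for urgent, fatigued, happy, and frustrated users may be viewed as a lookup from a determined user state to agent behavior. The following sketch is illustrative only: the class, field names, and string values are hypothetical, and entries not stated in the description (such as the speech rate for an urgent user, or the neutral default) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBehavior:
    facial_expression: str
    speech_rate: str   # "slow", "normal", or "fast"
    suggestions: str   # "none", "critical_only", "normal", or "extra"


# User state -> agent behavior, following the examples in the description.
BEHAVIOR_RULES = {
    "urgent": AgentBehavior("serious", "normal", "none"),
    "fatigued": AgentBehavior("relaxed", "slow", "normal"),
    "happy": AgentBehavior("happy", "fast", "extra"),
    "frustrated": AgentBehavior("concerned", "normal", "critical_only"),
}

# Assumed fallback for states that have no rule.
DEFAULT_BEHAVIOR = AgentBehavior("neutral", "normal", "normal")


def select_behavior(user_state):
    """Return the agent behavior for a user state, or the default."""
    return BEHAVIOR_RULES.get(user_state, DEFAULT_BEHAVIOR)
```

A table of this kind keeps each rule independent, so a rule for a new user state can be added without changing the selection logic.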

The dynamic adaptor module 336 may adjust the intelligent social agent to behave
based on the familiarity of the user with the current computer device, application
program, or application program function and the complexity of the application program.
For example, when the application program is complex and the user is not familiar with 
the application program (e.g., the user is using an application program for the first time or 
the user has not used the application program for some predetermined period of time), the
dynamic adaptor module 336 may have the intelligent social agent ask the user whether
the user would like help, and, if the user so indicates, the intelligent social agent starts a 
help function for the application program. When the application program is not complex 
or the user is familiar with the application program, the dynamic adaptor module 336 
typically does not have the intelligent social agent offer help to the user. 
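
The help-offering rule described above may be sketched as a single predicate. The parameter names and the 90-day default for the "predetermined period of time" are assumptions for illustration.

```python
def should_offer_help(app_is_complex, times_used,
                      days_since_last_use=None, stale_after_days=90):
    """Decide whether the agent should ask the user if help is wanted.

    The user is treated as unfamiliar on first use, or when the application
    has not been used for more than `stale_after_days` (an assumed value).
    """
    first_use = times_used == 0
    stale = (days_since_last_use is not None
             and days_since_last_use > stale_after_days)
    return app_is_complex and (first_use or stale)
```

Help is offered only when both conditions hold, so a simple application, or a complex one the user knows well, does not trigger the offer.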

The verbal generator 340 receives information from the adaptation engine 330 and
produces verbal expressions for the intelligent social agent 350. The verbal generator 340
may receive the appropriate verbal expression for the intelligent social agent from the 
dynamic adaptor module 336. The verbal generator 340 uses information from the 
machine learning module 332 to produce the specific content and linguistic style for the
intelligent social agent 350.

The verbal generator 340 then sends the textual verbal content to an I/O device for 
the computer device, typically a display device, or a text-to-speech generation program 
that converts the text to speech and sends the speech to a speech synthesizer. 

The affect generator 360 receives information from the adaptation engine 330 and
produces the affective expression for the intelligent social agent 350. The affect generator
360 produces facial expressions and vocal expressions for the intelligent social agent 350 
based on an indication from the dynamic adaptor module 336 as to what emotion the
intelligent social agent 350 should express. A process for generating affect is described
with respect to FIG. 5. 

Referring to FIG. 4A, a process 400A controls a processor to extract nonverbal 
information and determine the affective state of the user. The process 400A is initiated by
receiving physiological state data about the user (step 410A). Physiological state data
may include autonomic data, such as heart rate, blood pressure, respiration rate, 
temperature, and skin conductivity. Physiological data may be determined using a device 
that attaches from the computer 110 to a user's finger or palm and is capable of detecting 
the heart rate, respiration rate, and blood pressure of the user. 

The processor then tentatively determines a hypothesis for the affective state of
the user based on the physiological data received through the physiological channel (step 
415A). The processor may use predetermined decision logic that correlates particular 
physiological responses with an affective state. As described above with respect to FIG. 
3, an accelerated heart rate may be associated with fear or anger and a slow heart rate may
indicate a relaxed state.
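
A minimal sketch of such predetermined decision logic for the physiological channel follows. It uses heart rate only; the function name, the state labels, and the 15% tolerance band around the user's typical rate are assumptions and are not taken from this description.

```python
def physiological_hypothesis(current_bpm, typical_bpm, tolerance=0.15):
    """Map a heart rate to a tentative affective-state hypothesis.

    An accelerated rate suggests fear or anger; a slow rate suggests a
    relaxed state. The tolerance band is an illustrative assumption.
    """
    if current_bpm > typical_bpm * (1 + tolerance):
        return "fear_or_anger"
    if current_bpm < typical_bpm * (1 - tolerance):
        return "relaxed"
    return "neutral"
```

Heart rate alone does not distinguish fear from anger, which is one reason the hypothesis is tentative until the other channels are considered.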

The second channel of data received by the processor to determine the user's 
affective state is the vocal analysis data (step 420A), such as the pitch range, the volume, 
and the degree of breathiness in the speech of the user. For example, louder and faster 
speech compared to the user's basic pattern may indicate that a user is happy. Similarly,
quieter and slower speech than normal may indicate that a user is sad. The processor then
determines a hypothesis for the affective state of the user based on the vocal analysis data 
received through the vocal feature channel (step 425A). 

The third channel of data received by the processor for determining the user's 
affective state is the user's verbal content that reveals the user's emotions (step 430A).
Examples of such verbal content include phrases such as "Wow, this is great" or "What?
The file disappeared?". The processor then determines a hypothesis for the affective state
of the user based on the verbal content received through the verbal channel (step 435A).

The processor then integrates the affective state hypotheses based on the data from 
the physiological channel, the vocal feature channel, and the verbal channel, resolves any
conflict, and determines a conclusive affective state of the user (step 440A). Conflict
resolution may be accomplished through predetermined decision logic. A confidence 
coefficient is given to the affective state predicted by each of the three channels based on
the inherent predictive power of that channel for that particular emotion and the
unambiguity level of the specific diagnosis of the emotional state in occurrence. Then the 
processor disambiguates by comparing and integrating the confidence coefficients. 
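
The integration of confidence coefficients may be sketched as follows. Summing the coefficients of channels that agree and selecting the highest total is an assumption of this sketch; the description states only that the coefficients are compared and integrated. The function name and the input format are likewise hypothetical.

```python
from collections import defaultdict


def integrate_hypotheses(hypotheses):
    """Resolve per-channel hypotheses into one conclusive affective state.

    `hypotheses` maps a channel name to (affective_state, confidence), or to
    None when that channel supplied no data. Returns None when no channel
    produced a hypothesis.
    """
    scores = defaultdict(float)
    for hypothesis in hypotheses.values():
        if hypothesis is None:
            continue  # channel not available in this implementation
        state, confidence = hypothesis
        scores[state] += confidence  # agreeing channels reinforce each other
    if not scores:
        return None
    return max(scores, key=scores.get)
```

With this rule, two moderately confident channels that agree can outweigh a single channel that disagrees, and a lone channel simply determines the result.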

Some implementations may receive either physiological data, vocal analysis data, 
verbal content, or a combination. When only one type of data is received, integration
(step 440A) may not be performed. For example, when only physiological data is 
received, steps 420A-440A are not performed and the processor uses the affective state of 
the user based on physiological data as the affective state of the user. Similarly, when 
only vocal analysis data is received, the process is initiated when vocal analysis data is 

10 received and steps 410A, 415A, and 430A-445A are not performed. The processor uses 
the affective state of the user based on vocal analysis data as the affective state of the user. 

Similarly, referring to FIG. 4B, a process 400B controls a processor to extract
nonverbal information and determine the affective state of the user. The processor
receives physiological data about the user (step 410B), vocal analysis data (step 420B),
and verbal content that indicates the emotion of the user (step 430B) and determines a
hypothesis for the affective state of the user based on each type of data (steps 415B,
425B, and 435B) in parallel. The processor then integrates the affective state hypotheses
based on the data from the physiological channel, the vocal feature channel, and the
verbal channel, resolves any conflict, and determines a conclusive affective state of the
user (step 440B) as described with respect to FIG. 4A.

Referring to FIG. 5, a process 500 controls a processor to adapt an intelligent 
social agent to the user and the context. The process 500 may help an intelligent social 
agent to act appropriately based on the user and the application context. 

The process 500 is initiated when content and contextual information is received
(step 510) by the processor from an input/output device (such as a voice recognition and
speech synthesis device, a video camera, or physiological detection device connected to a
finger of the user) to the computer 110. The content and contextual information received
may be verbal information, nonverbal information, or contextual information received
from the user or application program or may be information compiled by an information
extractor (as described previously with respect to FIG. 3).

The processor then accesses data storage device 150 to determine the basic user 
profile for the user with whom the intelligent social agent is interacting (step 515). The
basic user profile includes personal characteristics (such as name, age, gender, ethnicity
or national origin information, and preferred language) about the user, professional
characteristics about the user (such as occupation, position of employment, and one or
more affiliated organizations), and non-verbal information about the user (such as
linguistic style and physiological profile information). The basic user profile information
may be received during a registration process for a product that hosts an intelligent social 
agent or by a casting process to create an intelligent social agent for a user and stored on 
the computing device. 

The processor may adjust the context and content information received based on
the basic user profile information (step 520). For example, a verbal instruction to "read
email messages now" may be received. Typically, a verbal instruction modified with the 
term "now" may result in a user context mode of "urgent." However, when the basic user 
profile information indicates that the user typically uses the term "now" as part of an 
instruction, the user context mode may be changed to "normal". 
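
The adjustment in this example can be sketched as follows. The function name, the set of urgency terms, and the representation of the profile as a set of habitually used terms are assumptions made for illustration only.

```python
def context_mode(instruction, habitual_terms):
    """Return "urgent" or "normal" for a verbal instruction.

    habitual_terms is an assumed profile field: urgency words that this
    user routinely attaches to instructions, so they carry no signal.
    """
    urgency_terms = {"now", "immediately"}  # hypothetical trigger words
    words = {w.strip(".,!?").lower() for w in instruction.split()}
    triggered = words & urgency_terms
    # Only urgency words the user does not habitually use raise the mode.
    if triggered - set(habitual_terms):
        return "urgent"
    return "normal"
```

For the instruction "read email messages now", the sketch yields "urgent" for a user with no habitual terms and "normal" for a user whose profile lists "now".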

The processor may adjust the content and context information received by
determining the affective state of the user. The affective state of the user may be
determined from content and context information (such as physiological data or vocal 
analysis data). 

The processor modifies the intelligent social agent based on the adjusted content 
and context information (step 525). For example, the processor may modify the linguistic
style and speech style of the intelligent social agent to be more similar to the linguistic 
style and speech style of the user. 

The processor then performs essential actions in the application program (step 
530). For example, when the user enters a request to "check my email messages" and the 
email application program is not activated, the intelligent social agent activates the email
application program and initiates the email application function to check email messages
(as described previously with respect to FIG. 3).

The processor determines the appropriate verbal expression (step 535) and an 
appropriate emotional expression for the intelligent social agent (step 540) that may 
include a facial expression.

The processor generates an appropriate verbal expression for the intelligent social 
agent (step 545). The appropriate verbal expression includes the appropriate verbal
content and appropriate emotional semantics based on the content and contextual
information received, the basic user profile information, or a combination of the basic 
user profile information and the content and contextual information received. 

For example, words that have affective connotation may be used to match the
appropriate emotion that the agent should express. This may be accomplished by using
an electronic lexicon that associates a word with an affective state, such as associating the
word "fantastic" with happiness, the word "delay" with frustration, and so on. The
processor selects the word from the lexicon that is appropriate for the user and the
context. Similarly, the processor may increase the number of words used in a verbal
expression when the affective state of the user is happy or may decrease the number of
words used or use words with fewer syllables if the affective state of the user is sad.
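
A lexicon lookup of this kind might be sketched as below. Only the entries for "fantastic" and "delay" come from the description; the other entries, the function names, and the choice between a short and a long phrasing are hypothetical.

```python
# Affective lexicon: word -> affective state it connotes.
LEXICON = {
    "fantastic": "happiness",
    "delay": "frustration",
    "great": "happiness",       # hypothetical entry
    "setback": "frustration",   # hypothetical entry
}


def choose_word(target_affect, lexicon=LEXICON):
    """Return a word connoting target_affect, or None if the lexicon has none."""
    matches = sorted(w for w, affect in lexicon.items() if affect == target_affect)
    return matches[0] if matches else None


def shape_expression(short_form, long_form, user_state):
    """Use the wordier phrasing for a happy user and the terser one otherwise."""
    return long_form if user_state == "happy" else short_form
```

Sorting the matches only makes the choice deterministic; an implementation could instead rank candidates by user or context.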

The processor may send the verbal expression text to an I/O device for the 
computer device, typically a display device. The processor may convert the verbal 
expression text to speech and output the speech. This may be accomplished using a
text-to-speech conversion program and a speech synthesizer.

In the meantime, the processor generates an appropriate affect for the facial 
expression of the intelligent social agent (step 550). Otherwise, a default facial 
expression may be selected. A default facial expression may be determined by the 
application, the role of the agent, and the target user population. In general, an intelligent
social agent by default may be slightly friendly, smiling, and pleasant.

Facial emotional expressions may be accomplished by modifying portions of the
face of the intelligent social agent to show affect. For example, surprise may be indicated
by showing the eyebrows raised (e.g., curved and high), the skin below the brow stretched
horizontally, wrinkles across the forehead, the eyelids opened with the white of the eye
visible, and the jaw open without tension or stretching of the mouth.

Fear may be indicated by showing the eyebrows raised and drawn together,
forehead wrinkles drawn to the center of the forehead, the upper eyelid raised and the lower
eyelid drawn up, the mouth open, and the lips slightly tense or stretched and drawn back.
Disgust may be indicated by showing the upper lip raised, the lower lip raised and pushed up
to the upper lip or the lower lip lowered, the nose wrinkled, the cheeks raised, lines
below the lower lid, the lid pushed up but not tense, and the brows lowered. Anger may be
indicated by the eyebrows lowered and drawn together, vertical lines between the eyebrows,
the lower lid tensed, the upper lid tense, the eyes having a hard stare and a bulging
appearance, the lips either pressed firmly together or tensed in a square shape, and nostrils
that may be dilated. Happiness may be indicated by the corners of the lips being drawn back
and up, a wrinkle shown from the nose to the outer edge beyond the lip corners, the cheeks
raised, the lower eyelid showing wrinkles below it, the lower eyelid possibly raised but not
tense, and crow's-feet wrinkles going outward from the outer corners of the eyes. Sadness
may be indicated by drawing the inner corners of the eyebrows up, triangulating the skin
below the eyebrow, raising the inner corner of the upper lid and upper corner, and
showing the corners of the lips drawn or the lip trembling.

The processor then generates the appropriate affect for the verbal expression of
the intelligent social agent (step 555). This may be accomplished by modifying the
speech style from the baseline style of speech for the intelligent social agent. Speech
style may include speech rate, pitch average, pitch range, intensity, voice quality, pitch
changes, and level of articulation. For example, a vocal expression may indicate fear
when the speech rate is much faster, the pitch average is very much higher, the pitch
range is much wider, the intensity of speech is normal, the voice quality is irregular, the pitch
change is normal, and the articulation is precise. Speech style modifications that may
connote a particular affective state are set forth in the table below and are further
described in Murray, I. R., & Arnott, J. L. (1993), Toward the simulation of emotion in
synthetic speech: A review of the literature on human vocal emotion, Journal of the
Acoustical Society of America, 93, 1097-1108.



WO 03/073417 



PCT/US03/06218





              | Fear              | Anger                        | Sadness              | Happiness                 | Disgust
Speech Rate   | Much Faster       | Slightly Faster              | Slightly Slower      | Faster or Slower          | Very Much Slower
Pitch Average | Very Much Higher  | Very Much Higher             | Slightly Lower       | Much Higher               | Very Much Lower
Pitch Range   | Much Wider        | Much Wider                   | Slightly Narrower    | Much Wider                | Slightly Wider
Intensity     | Normal            | Higher                       | Lower                | Higher                    | Lower
Voice Quality | Irregular Voicing | Breathy Chest Tone           | Resonant             | Breathy Blaring           | Grumbled Chest Tone
Pitch Changes | Normal            | Abrupt on Stressed Syllables | Downward Inflections | Smooth Upward Inflections | Wide Downward Terminal Inflections
Articulation  | Precise           | Tense                        | Slurring             | Normal                    | Normal
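
One way to apply such speech style modifications to a baseline is sketched below for the fear and sadness entries. The numeric step assigned to each qualitative degree, the attribute units, and the treatment of an unqualified "Lower" as a slight decrease are assumptions; voice quality, pitch changes, and articulation are omitted because they are not simple scalars.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpeechStyle:
    rate: float         # e.g. words per minute
    pitch_avg: float    # e.g. Hz
    pitch_range: float  # e.g. Hz
    intensity: float    # relative loudness


# Hypothetical relative change for each qualitative degree in the table.
STEP = {"normal": 0.0, "slightly": 0.1, "much": 0.3, "very much": 0.5}

# (degree, direction): +1 means faster/higher/wider, -1 the opposite.
MODIFICATIONS = {
    "fear": {
        "rate": ("much", +1),
        "pitch_avg": ("very much", +1),
        "pitch_range": ("much", +1),
        "intensity": ("normal", 0),
    },
    "sadness": {
        "rate": ("slightly", -1),
        "pitch_avg": ("slightly", -1),
        "pitch_range": ("slightly", -1),
        "intensity": ("slightly", -1),  # table says only "Lower"; degree assumed
    },
}


def apply_affect(baseline, affect):
    """Scale each baseline attribute by the modification for the affect."""
    changes = MODIFICATIONS[affect]
    values = {}
    for f in fields(baseline):
        degree, direction = changes[f.name]
        values[f.name] = getattr(baseline, f.name) * (1 + direction * STEP[degree])
    return SpeechStyle(**values)


baseline = SpeechStyle(rate=160.0, pitch_avg=120.0, pitch_range=40.0, intensity=60.0)
fearful = apply_affect(baseline, "fear")
```

Keeping the baseline immutable means each utterance starts from the agent's own style rather than from the previous affect.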



Referring to FIG. 6, a process 600 controls a processor to create an intelligent 
social agent for a target user population. This process (which may be referred to as 
casting an intelligent social agent) may produce an intelligent social agent whose
appearance and voice are appealing and appropriate for the target users. 

The process 600 begins with the processor accessing user information stored in 
the basic user profile (step 605). The user information stored within the basic user profile 
may include personal characteristics (such as name, age, gender, ethnicity or national 
origin information, and preferred language) about the user and professional characteristics
about the user (such as occupation, position of employment, and one or more affiliated 
organizations). 

The processor receives information about the role of the intelligent social agent 
for one or more particular application programs (step 610). For example, the intelligent 
social agent may be used as a help agent to provide functional help information about an
application program or may be used as an entertainment player in a game application
program. 

The processor then applies an appeal rule to further analyze the basic user profile
and to select a visual appearance for the intelligent social agent that may be appealing to
the target user population (step 620). The processor may apply decision logic that
associates a particular visual appearance for an intelligent social agent with particular age
groups, occupations, gender, or ethnic or cultural groups. For example, decision logic
may be based on similarity-attraction (that is, matching the ages, personalities, and
ethnic identities of the intelligent social agent and the user). A professional-looking
talking-head may be more appropriate for an executive user (such as a chief executive
officer or a chief financial officer), and a talking-head with an ultra-modern hair style
may be more appealing to an artist.

The processor applies an appropriateness rule to further analyze the basic user 
profile and to modify the casting of the intelligent social agent (step 630). For example,
a male intelligent social agent may be more suitable for technical subject matter, and a
female intelligent social agent may be more appropriate for fashion and cosmetics subject 
matter. 

The processor then presents the visual appearance for the intelligent social agent 
to the user (step 640). Some implementations may allow the user to modify attributes
(such as the hair color, eye color, and skin color) of the intelligent social agent or select
from among several intelligent social agents with different visual appearances. Some 
implementations also may allow a user to import a graphical drawing or image to use as 
the visual appearance for the intelligent social agent. 

The processor applies the appeal rule to the stored basic user profile (step 650)
and the appropriateness rule to the stored basic user profile to select a voice for the
intelligent social agent (step 660). The voice should be appealing to the user and be 
appropriate for the gender represented by the visual intelligent social agent (e.g., an 
intelligent social agent with a male visual appearance has a male voice and an intelligent 
social agent with a female visual appearance has a female voice). The processor may
match the user's speech style characteristics (such as speech rate, pitch average, pitch
range, and articulation) as appropriate for the voice of the intelligent social agent.

The processor presents the voice choice for the intelligent social agent (step 670).
Some implementations may allow the user to modify the speech characteristics for the 
intelligent social agent. 

The processor then associates the intelligent social agent with the particular user
(step 680). For example, the processor may associate an intelligent social agent identifier
with the intelligent social agent, store the intelligent social agent identifier and
characteristics of the intelligent social agent in the data storage device 150 of the
computer 110, and store the intelligent social agent identifier with the basic user profile.
Some implementations may cast one or more intelligent social agents to be appropriate
for a group of users that have similar personal or professional characteristics.

Referring to FIG. 7, an implementation of an intelligent social agent is an
intelligent personal assistant. The intelligent personal assistant interacts with a user of a
computing device, such as computing device 210, to assist the user in operating the
computing device 210 and using application programs. The intelligent personal assistant
assists the user of the computing device to manage personal information, operate the
computing device 210 or one or more application programs running on the computing
device, and use the computing device for entertainment.

The intelligent personal assistant may operate on a mobile computing device, such 
as a PDA, laptop, or mobile phone, or a hybrid device including the functions associated
with a PDA, laptop, or mobile phone. When an intelligent personal assistant operates on
a mobile computing device, the intelligent personal assistant may be referred to as an 
intelligent mobile personal assistant. The intelligent personal assistant also may operate 
on a stationary computing device, such as a desktop personal computer or workstation, 
and may operate on a system of networked computing devices, as described with respect
to FIG. 1.

FIG. 7 illustrates one implementation of an architecture 700 for an intelligent
personal assistant 730. Application programs 710, including a personal information
management application program 715, one or more entertainment application programs
720, and/or one or more application programs to operate the computing device 725, may
run on a computing device, as described with respect to FIG. 1.

The intelligent personal assistant 730 uses the social intelligence engine 735 to 
interact with a user 740 and the application programs 710. Social intelligence engine 735
is substantially similar to social intelligence engine 300 of FIG. 3. The information
extractor 745 of the intelligent personal assistant 730 receives information from and about 
the application programs 710 and information from and about the user 740, in a similar 
manner as described with respect to FIG. 3.

The intelligent personal assistant 730 processes the extracted information using an
adaptation engine 750 and then generates one or more responses (including verbal content
and facial expressions) to interact with the user 740 using the verbal generator 755 and
the affect generator 760, in a similar manner as described with respect to FIG. 3. The
intelligent personal assistant 730 also may produce one or more responses to operate one
or more of the application programs 710 running on the computing device 210, as
described with respect to FIGS. 2-3 and FIGS. 8-10. The responses produced may enable
the intelligent personal assistant 730 to appear appealing, affective, adaptive, and
appropriate when interacting with the user 740. The user 740 also interacts with one or
more of the application programs 710.

FIG. 8 illustrates an architecture 800 for implementing an intelligent personal
assistant that helps a user to manage personal information. The intelligent personal
assistant 810 may assist the user 815 as an assistant that works across all personal
information management application program functions. For a business user using a
mobile computing device, the intelligent personal assistant 810 may be able to function as
an administrative assistant in helping the user manage appointments, email messages, and
contact lists. As similarly described with respect to FIGS. 3 and 7, the intelligent
personal assistant 810 interacts with the user 815 and the personal information
management application program 820 using the social intelligence engine 825, which
includes an information extractor 830, an adaptation engine 835, a verbal generator 840,
and an affect generator 845.

The personal information management application program 820 (which also may 
be referred to as a PIM) includes email functions 850, calendar functions 855, contact 
management functions 860, and task list functions 865 (which also may be referred to as a
"to do" list). The personal information management application program may be, for
example, a version of Microsoft® Outlook®, such as Pocket Outlook®, by Microsoft
Corporation, that operates on a PDA. 

The intelligent personal assistant 810 may interact with the user 815 concerning
email functions 850. For example, the intelligent personal assistant 810 may report the
status of the user's email account, such as the number of unread messages or the number
of unread messages having an urgent status, at the beginning of a work day or when the
user requests such an action. The intelligent personal assistant 810 may communicate
with the user 815 with a more intense affect about unread messages having an urgent
status, or when the number of unread messages is higher than typical for the user 815
(based on intelligent and/or statistical monitoring of typical email patterns). The
intelligent personal assistant 810 may notify the user 815 of recently received messages
and may communicate with a more intense affect when a recently received message has
an urgent status. The intelligent personal assistant 810 may help the user manage
messages, such as suggesting messages be deleted or archived based on the user's typical
message deletion or archival patterns or when the storage space for messages is reaching
or exceeding its limit, or suggesting messages be forwarded to particular users or groups
of users based on the user's typical message forwarding patterns.
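
The statistical monitoring mentioned above could take many forms. The sketch below assumes one simple form: the unread count is compared with the user's history using a z-score, and the threshold of 2.0 and the labels "intense" and "neutral" are hypothetical.

```python
from statistics import mean, pstdev


def email_affect(unread_now, unread_history, urgent_unread=0, z_threshold=2.0):
    """Choose the affect used when reporting email status.

    unread_history is an assumed record of the user's typical unread
    counts, for example one value per past work day.
    """
    if urgent_unread > 0:
        return "intense"  # urgent messages always raise the affect
    if len(unread_history) >= 2:
        mu = mean(unread_history)
        sigma = pstdev(unread_history)
        if sigma == 0:
            # No variation in the history: any excess is atypical.
            return "intense" if unread_now > mu else "neutral"
        if (unread_now - mu) / sigma > z_threshold:
            return "intense"
    return "neutral"
```

With too little history the sketch stays neutral, since no typical pattern has been established yet.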

The intelligent personal assistant 810 may help the user 815 manage the user's
calendar 855. For example, the intelligent personal assistant 810 can report to the user
his/her upcoming appointments for the day in the morning or at any time the user desires.
The intelligent personal assistant 810 may remind the user 815 of upcoming
appointments at a time desired by the user and also determine how far the location of the
appointment is from the user's current location. If the user is late or seems late for an
appointment, the intelligent personal assistant 810 will accordingly remind him/her in an
urgent manner such as speaking a little louder and appearing a little concerned. For
example, when a user does not need to travel to an upcoming appointment, such as a
business meeting at the office in which the user is located, and the appointment is a
regular one in terms of significance and urgency, the intelligent personal assistant 810
may remind the user 815 of the appointment in a neutral affect with regular voice tone
and facial expression. As the time approaches for an upcoming appointment that requires
the user to leave the premises to travel to the appointment, the intelligent personal
assistant 810 may remind the user 815 of the appointment in a voice with a higher volume
and with more urgent affect.
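
The choice of reminder affect described above can be illustrated with a small sketch. The three labels, the ten-minute margin, and the use of estimated travel time as the only input besides time remaining are assumptions.

```python
def reminder_affect(minutes_until_start, travel_minutes, margin_minutes=10):
    """Pick an affect for an appointment reminder.

    slack is the time left after subtracting the assumed travel time;
    a negative slack means the user is late or seems late.
    """
    slack = minutes_until_start - travel_minutes
    if slack < 0:
        return "urgent"
    if travel_minutes > 0 and slack <= margin_minutes:
        return "elevated"  # travel needed and departure time is close
    return "neutral"
```

An appointment in the user's own office has zero travel time and stays neutral until its start time has passed.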

The intelligent personal assistant 810 may help the user 815 enter an appointment
in the calendar. For example, the user 815 may verbally describe the appointment using
general or relative terms. The intelligent personal assistant 810 transforms the general
description of the appointment into information that can be entered into the calendar
application program 855 and sends a command to enter the information into the calendar.
For example, the user may say "I have an appointment with Dr. Brown next Thursday at
1." Using the social intelligence engine 825, the intelligent personal assistant 810 may
generate the appropriate commands to the calendar application program 855 to enter an
appointment in the user's calendar. For example, the intelligent personal assistant 810
may understand that Dr. Brown is the user's physician (possibly by performing a search
within the contacts database 860) and that the user will have to travel to the physician's
office. The intelligent personal assistant 810 also may look up the address using contact
information in the contact management application program 860, and may use a mapping
application program to estimate the time required to travel from the user's office address
to the doctor's office, and determine the date that corresponds to "next Thursday". The
intelligent personal assistant 810 then sends commands to the calendar application
program to enter the appointment at 1:00 pm on the appropriate date and to generate a
reminder message for a sufficient time before the appointment that allows the user time to
travel to the doctor's office.
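
Resolving the relative date and deriving the reminder time can be sketched with the Python standard library. The reading of "next Thursday" as the first Thursday strictly after the current date, the ten-minute buffer, and the function names are assumptions; the travel time would come from a mapping application and is passed in here as a plain number.

```python
from datetime import date, datetime, time, timedelta


def next_weekday(today, weekday):
    """First date strictly after today that falls on weekday (Mon=0 .. Sun=6)."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def schedule(today, weekday, start, travel_minutes, buffer_minutes=10):
    """Return (appointment start, reminder time) as datetimes."""
    day = next_weekday(today, weekday)
    starts_at = datetime.combine(day, start)
    # Remind early enough to cover travel plus an assumed buffer.
    remind_at = starts_at - timedelta(minutes=travel_minutes + buffer_minutes)
    return starts_at, remind_at


THURSDAY = 3
starts_at, remind_at = schedule(date(2003, 2, 24), THURSDAY, time(13, 0), 30)
```

Some speakers use "next Thursday" to mean the Thursday of the following week, so an implementation might confirm the resolved date with the user.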

The intelligent personal assistant 810 also may help the user 815 manage the 
user's contacts 860. For example, the intelligent personal assistant 810 may enter 
information for a new contact that the user 815 has spoken to the intelligent personal 
assistant 810. For example, the user 815 may say "My new doctor is Dr. Brown in
Oakdale." The intelligent personal assistant 810 looks up the full name, address, and
telephone number of Dr. Brown by using a web site of the user's insurance company that 
lists the doctors that accept payment from the user's insurance carrier. The intelligent 
personal assistant 810 then sends commands to the contact application program 860 to 
enter the contact information. The intelligent personal assistant 810 may help organize
the contact list by entering new contacts that cross-reference contacts entered by the user
815, such as entering the contact information for Dr. Brown also under "Physician". 

The intelligent personal assistant 810 may help the user 815 manage the user's 
task list application 865. For example, the intelligent personal assistant 810 may enter 
information for a new task, read the task list to the user when the user may not be able to
view the text display of the computing device, such as when the user is driving an
automobile, and remind the user of tasks that are due in the near future. The intelligent 
personal assistant 810 may remind the user 815 of a task with a higher importance rating 
that is due in the near future using a voice with a higher volume and more urgent affect.

Some personal information management application programs may include voice
mail and phone call functions (not shown). The intelligent personal assistant 810 may
help manage the voice mail messages received by the user 815, such as by playing
messages, saving messages, or reporting the status of messages (e.g., how many new
messages have been received). The intelligent personal assistant 810 may remind the user
815 that a new message has not been played using a voice with higher volume and more
urgent affect when more time has passed than typical for the user to check his/her voice mail
messages.

The intelligent personal assistant 810 may help the user manage the user's phone
calls. The intelligent personal assistant 810 may act as a virtual secretary for the user 815
by receiving and selectively processing received phone calls. For example, when the user
is busy and does not want to receive phone calls,
the intelligent personal assistant 810 may not notify the user about an incoming call. The
intelligent personal assistant 810 may selectively notify the user about incoming phone
calls based on a priority scheme in which the user specifies a list of people with whom
the user will speak if a phone call is received, or will speak with if a phone call is
received under particular conditions specified by the user, for example, even when the
user is busy.
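
A priority scheme of this kind might be represented as a mapping from caller to a notification rule. The two rule names, the example entries, and the decision to suppress calls from unlisted callers are one possible reading, offered only as a sketch.

```python
# Hypothetical user-specified priority list: caller -> rule.
PRIORITY = {
    "Dr. Brown": "even_when_busy",
    "Pat": "when_free",
}


def should_notify(caller, user_busy, priority=PRIORITY):
    """Decide whether to tell the user about an incoming call."""
    rule = priority.get(caller)
    if rule == "even_when_busy":
        return True
    if rule == "when_free":
        return not user_busy
    return False  # callers not on the list are screened silently
```

Richer conditions, such as time of day or calendar state, could replace the single busy flag without changing the structure.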

The intelligent personal assistant 810 also may be able to organize and present
news to the user 815. The intelligent personal assistant 810 may use news sources and
categories of news based on the user's typical patterns. Additionally or alternatively, the
user 815 may select news sources and categories for the intelligent personal assistant 810
to use.

The user 815 may select the modality through which the intelligent personal 
assistant 810 produces output, such as whether the intelligent personal assistant produces
only speech output, only text output on a display, or both speech and text output. The 
user 815 may indicate by using speech input or clicking a mute button that the intelligent 
personal assistant 810 is only to use text output. 

FIG. 9 illustrates an architecture 900 of an intelligent personal assistant helping a
user to operate applications in a computing device. The intelligent personal assistant 910
may assist the user 915 across various application programs or functions. As described
with respect to FIGS. 3 and 7, intelligent personal assistant 910 interacts with the user
915 and the application programs 920 in a computing device, including basic functions
relating to the device itself and applications running on the device such as enterprise
applications. The intelligent personal assistant 910 similarly uses the social intelligence
engine 945 including an information extractor 950, an adaptation engine 955, a verbal
generator 960, and an affect generator 965.

Some examples of basic functions relating to a computing device itself are
checking battery status 925, opening or closing an application program 930, 935, and
synchronizing data 940, among many other functions. The intelligent personal assistant
910 may interact with the user 915 concerning the status of the battery 925 in the
computing device. For example, the intelligent personal assistant 910 may report that the
battery is running low when the battery is running lower than ten percent (or other
user-defined threshold) of the battery's capacity. The intelligent personal assistant 910 may
make suggestions, such as dimming the screen or closing some applications, and send the
commands to accomplish those functions when the user 915 accepts the suggestions.
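
The battery check and the follow-up on accepted suggestions can be sketched as two small functions. The function names and the suggestion strings are illustrative; the ten percent default mirrors the threshold in the example above.

```python
def battery_suggestions(charge_fraction, threshold=0.10):
    """Return suggestions to offer when the charge is below the threshold."""
    if not 0.0 <= charge_fraction <= 1.0:
        raise ValueError("charge_fraction must be between 0 and 1")
    if charge_fraction < threshold:
        return ["dim the screen", "close some applications"]
    return []


def accepted_commands(suggestions, accepted):
    """Keep only the suggestions the user accepted, in the order offered."""
    return [s for s in suggestions if s in accepted]
```

Commands are issued only for accepted suggestions, so the assistant never changes device settings without the user's consent.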

The intelligent personal assistant 910 may interact with the user 915 to switch
applications by using an open application program 930 function and a close application
program 935 function. For example, the intelligent personal assistant 910 may close a
particular spreadsheet file and open a particular word processing document when the user
indicates that a particular word processing document should be opened because the user
typically closes the particular spreadsheet file when opening the particular word
processing document.

The intelligent personal assistant 910 may interact with the user to synchronize 
data 940 between two computing devices. For example, the intelligent personal assistant 
910 may send commands to copy personal management information from a portable 
computing device, such as a PDA, to a desktop computing device. The user 915 may
request that the devices be synchronized without specifying what information is to be
synchronized. The intelligent personal assistant 910 may synchronize appropriate 
personal management information based on the user's typical pattern of keeping contact 
and task list information synchronized on the desktop but not copying appointment 
information that resides only in the PDA. 

Beyond the basic functions for operating a computing device itself, the intelligent
personal assistant 910 can help a user operate a wide range of applications running on the
computing device. Examples of enterprise applications for an intelligent personal
assistant 910 are business reports, budget management, project management,
manufacturing monitoring, inventory control, purchase, sales, learning and training.

On mobile enterprise portals, an intelligent personal assistant 910 can provide
tremendous assistance to the user 915 by prioritizing and pushing out important and
urgent information. The context-defining method for applications in the intelligent social
agent architecture guides the intelligent personal assistant 910 in this matter. For
example, the intelligent personal assistant 910 can push out an alert of a sales drop as a top
priority either by displaying it on the screen or saying it to the user. The intelligent
personal assistant 910 adapts its verbal style to make it straightforward and concise,
speaks a little faster, and appears concerned, such as with slight frowning, in the case of
a sales-drop alert. The intelligent personal assistant 910 can present business reports,
such as sales reports and acquisition reports, and project status, such as a production timeline,
to the user through speech or graphical display. The intelligent personal assistant 910
would push out or mark any emergent or serious problems in these matters. The
intelligent personal assistant 910 may present approval requests to managers in a
simple and straightforward manner so that the user can immediately grasp the most
critical information instead of taking numerous steps to dig out the information by
him/herself.

FIG. 10 illustrates an architecture 1000 of an intelligent personal assistant helping
a user to use a computing device for entertainment. Using the intelligent personal
assistant for entertainment may increase the user's willingness to interact with the
intelligent personal assistant for non-entertainment applications. The intelligent personal
assistant 1010 may assist the user 1015 across various entertainment application
programs. As described with respect to FIGS. 3 and 7, the intelligent personal assistant 1010
interacts with the user 1015 and the computing device entertainment programs 1020, such
as by participating in games, providing narrative entertainment, and performing as an
entertainer. The intelligent personal assistant 1010 similarly uses the social intelligence
engine 1030, including an information extractor 1035, an adaptation engine 1040, a
verbal generator 1045, and an affect generator 1050.

The intelligent personal assistant 1010 may interact with the user 1015 by
participating in computing device-based games. For example, the intelligent personal
assistant 1010 may act as a participant when playing a game with the user, such as a
card game or other computing device-based game, for example an animated car racing game
or chess game. The intelligent personal assistant 1010 may interact with the user in a
more exaggerated manner when helping the user 1015 use the computing device for
entertainment than when helping the user with non-entertainment application programs.
For example, the intelligent personal assistant 1010 may speak louder, use colloquial
expressions, laugh, move its eyebrows up and down often, and open its eyes widely when
playing a game with the user. When the user wins a competitive game against the
intelligent personal assistant 1010, the intelligent personal assistant may praise the user
1015. When the user loses to the intelligent personal assistant, the intelligent personal
assistant may console the user, compliment the user, or discuss how to improve.
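
The game behavior described above can be sketched, under stated assumptions, as a small selection function: the verbal content depends on whether the user won, and the delivery is more exaggerated in an entertainment context. The function name game_response and all setting names and values are hypothetical and chosen only for illustration.

```python
def game_response(user_won, entertainment=True):
    """Choose a hypothetical response for the end of a competitive game."""
    # Verbal content depends on the outcome of the game.
    if user_won:
        verbal = "praise"
    else:
        verbal = "console"  # could instead be a compliment or improvement tips
    # Delivery is more exaggerated in entertainment contexts.
    if entertainment:
        delivery = {"volume": "louder", "wording": "colloquial",
                    "eyebrows": "animated"}
    else:
        delivery = {"volume": "normal", "wording": "standard",
                    "eyebrows": "still"}
    return {"verbal": verbal, **delivery}


print(game_response(user_won=True))
print(game_response(user_won=False, entertainment=False))
```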

The intelligent personal assistant 1010 may act as an entertainment companion by
providing narrative entertainment, such as by reading stories or re-narrating sporting
events to the user while the user is driving an automobile, or by telling jokes to the user when
the user is bored or tired. The intelligent personal assistant 1010 may perform as an
entertainer, such as by appearing to sing music lyrics (which may be referred to as
"lip-synching") or, when the intelligent personal assistant 1010 is represented as a full-bodied
agent, dancing to music to entertain.

Implementations may include a method or process, an apparatus or system, or
computer software on a computer medium. It will be understood that various
modifications may be made without departing from the spirit and scope of the following
claims. For example, advantageous results still could be achieved if steps of the disclosed
techniques were performed in a different order and/or if components in the disclosed
systems were combined in a different manner and/or replaced or supplemented by other
components.

WHAT IS CLAIMED IS:

1. A computer-implemented method for implementing an intelligent personal
assistant comprising:

receiving an input associated with a user and an input associated with an
application program;

accessing a user profile associated with the user;

extracting context information from the received input; and

processing the context information and the user profile to produce an adaptive
response by the intelligent personal assistant.

2. The method of claim 1 wherein:

the application program is a personal information management application
program, and

the adaptive response by the intelligent personal assistant is associated with the
personal information management application program.

3. The method of claim 1 wherein:

the application program is an application program to operate a computing device,
and

the adaptive response by the intelligent personal assistant is associated with
operating the computing device.

4. The method of claim 1 wherein:

the application program is an entertainment application program, and

the adaptive response by the intelligent personal assistant is associated with the
entertainment application program.

5. The method of claim 4 wherein:

the entertainment application program is a game, and

the adaptive response by the intelligent personal assistant is associated with the
game.

6. A computer-readable medium or propagated signal having embodied
thereon a computer program configured to implement an intelligent personal assistant, the
medium comprising a code segment configured to:

receive an input associated with a user and an input associated with an application
program;

access a user profile associated with the user;

extract context information from the received input; and

process the context information and the user profile to produce an adaptive
response by the intelligent personal assistant.

7. The medium of claim 6 wherein:

the application program is a personal information management application
program, and

the adaptive response by the intelligent personal assistant is associated with the
personal information management application program.

8. The medium of claim 6 wherein:

the application program is an application program to operate a computing device,
and

the adaptive response by the intelligent personal assistant is associated with
operating the computing device.

9. The medium of claim 6 wherein:

the application program is an entertainment application program, and

the adaptive response by the intelligent personal assistant is associated with the
entertainment application program.

10. The medium of claim 9 wherein:

the entertainment application program is a game, and

the adaptive response by the intelligent personal assistant is associated with the
game.

11. A system for implementing an intelligent personal assistant, the system
comprising a processor connected to a storage device and one or more input/output
devices, wherein the processor is configured to:

receive an input associated with a user and an input associated with an application
program;

access a user profile associated with the user;

extract context information from the received input; and

process the context information and the user profile to produce an adaptive
response by the intelligent personal assistant.

12. The system of claim 11 wherein:

the application program is a personal information management application
program, and

the adaptive response by the intelligent personal assistant is associated with the
personal information management application program.

13. The system of claim 11 wherein:

the application program is an application program to operate a computing device,
and

the adaptive response by the intelligent personal assistant is associated with
operating the computing device.

14. The system of claim 11 wherein:

the application program is an entertainment application program, and

the adaptive response by the intelligent personal assistant is associated with the
entertainment application program.

15. The system of claim 14 wherein:

the entertainment application program is a game, and

the adaptive response by the intelligent personal assistant is associated with the
game.

16. An apparatus for implementing an intelligent social agent, the apparatus
comprising:

an information extractor configured to:

access a user profile associated with the user,

receive an input associated with a user, and

extract context information from the received input;

an adaptation engine configured to:

receive the context information and the user profile from the information
extractor, and process the context information and the user profile to produce an
adaptive output; and

an output generator configured to:

receive the adaptive output from the adaptation engine, and represent the
adaptive output in the intelligent social agent.

17. The apparatus of claim 16 wherein the input is physiological data
associated with the user and the information extractor is configured to receive the
physiological data.

18. The apparatus of claim 16 wherein the input is application program
information associated with the user and the information extractor is configured to receive
application program information associated with the user.

19. The apparatus of claim 16 wherein the information extractor is further
configured to extract information about an affective state of the user from the received
input.

20. The apparatus of claim 19 wherein the information extractor is configured
to extract information about an affective state of the user based on physiological
information associated with the user.

21. The apparatus of claim 19 wherein the information extractor configured to
extract information about an affective state of the user is configured to extract
information about an affective state of the user based on vocal analysis information
associated with the user by extracting verbal content and analyzing speech characteristics
of the user.

22. The apparatus of claim 19 wherein the information extractor configured to
extract information about an affective state of the user is configured to extract
information about an affective state of the user based on verbal information from the
received input.

23. The apparatus of claim 16 wherein the information extractor configured to 
extract context information is configured to extract a geographical position of the user by 
using a global positioning system. 

24. The apparatus of claim 23 wherein the information extractor configured to 
extract context information is configured to extract information based on the geographical 
position of the user. 

25. The apparatus of claim 16 wherein the information extractor configured to 
extract context information is configured to extract information about the application 
content associated with the user. 

26. The apparatus of claim 16 wherein the information extractor configured to 
extract context information is configured to extract information about a linguistic style of 
the user from the received input. 

27. The apparatus of claim 16 wherein: 
the output generator is a verbal generator; 

the adaptation engine configured to produce an adaptive output is configured to 
produce a verbal expression; and 

the verbal generator produces the verbal expression in the intelligent social agent. 

28. The apparatus of claim 16 wherein:

the generator is an affect generator;

the adaptation engine configured to produce an adaptive output is configured to
produce a facial expression; and

the affect generator represents the facial expression in the intelligent social agent.

29. The apparatus of claim 16 wherein the output generator is a multi-modal
output generator that represents the adaptive output in the intelligent social agent using at
least one of a first mode and a second mode.

30. The apparatus of claim 29 wherein:

the first mode is a verbal mode;

the second mode is an affect mode;

the adaptive engine configured to produce an adaptive output is configured to:

produce a facial expression, and

produce a verbal expression; and

the multi-modal output generator represents the facial expression and the verbal
expression in the intelligent social agent.

31. The apparatus of claim 16 wherein:

the adaptation engine is further configured to produce an emotional expression to
be represented by the intelligent social agent; and

the output generator is configured to represent the emotional expression in the
intelligent social agent.

32. A mobile device for implementing an intelligent social agent that interacts
with a user, the mobile device comprising:

a processor connected to a memory and one or more input/output devices;

a social intelligence engine configured to interact with the processor, the social
intelligence engine including:

an information extractor configured to:

access a user profile associated with the user,

receive an input associated with a user, and

extract context information from the received input;

an adaptation engine configured to:

receive the context information and the user profile from the
information extractor, and process the context information and the user profile to produce
an adaptive output; and

an output generator configured to:

receive the adaptive output from the adaptation engine, and

represent the adaptive output in the intelligent social agent.

33. The mobile device of claim 32 wherein the input is physiological data
associated with the user and the information extractor is configured to receive the
physiological data.

34. The mobile device of claim 32 wherein the input is application program
information associated with the user and the information extractor is configured to receive
the application program information.

35. The mobile device of claim 32 wherein the information extractor is further
configured to extract information about an affective state of the user from the received
input.

36. The mobile device of claim 35 wherein the information extractor is
configured to extract information about an affective state of the user based on
physiological information associated with the user.

37. The mobile device of claim 35 wherein the information extractor
configured to extract information about an affective state of the user is configured to
extract information about an affective state of the user based on vocal analysis
information associated with the user by extracting verbal content and analyzing speech
characteristics of the user from the received input.

38. The mobile device of claim 35 wherein the information extractor
configured to extract information about an affective state of the user is configured to
extract information about an affective state of the user based on verbal information from
the received input.

39. The mobile device of claim 32 wherein the information extractor
configured to extract context information is configured to extract a geographical position
of the user by using a global positioning system.

40. The mobile device of claim 35 wherein the information extractor
configured to extract context information is configured to extract information based on
the geographical position of the user.

41. The mobile device of claim 32 wherein the information extractor configured to
extract context information is configured to extract information about the application
content associated with the user.

42. The mobile device of claim 32 wherein the information extractor configured to
extract context information is configured to extract information about a linguistic style of
the user from the received input.

43. The mobile device of claim 32 wherein:

the output generator is a verbal generator;

the adaptation engine configured to produce an adaptive output is configured to
produce a verbal expression; and

the verbal generator produces the verbal expression in the intelligent social agent.

44. The mobile device of claim 32 wherein:

the generator is an affect generator;

the adaptation engine configured to produce an adaptive output is configured to
produce a facial expression; and

the affect generator represents the facial expression in the intelligent social agent.

45. The mobile device of claim 32 wherein the output generator is a
multi-modal output generator that represents the adaptive output in the intelligent social agent
using at least one of a first mode and a second mode.

46. The mobile device of claim 45 wherein:

the first mode is a verbal mode;

the second mode is an affect mode;

the adaptive engine configured to produce an adaptive output is configured to:

produce a facial expression, and

produce a verbal expression; and

the multi-modal output generator represents the facial expression and the verbal
expression in the intelligent social agent.

47. The mobile device of claim 32 wherein:

the adaptation engine is further configured to produce an emotional expression to
be represented by the intelligent social agent; and

the output generator is configured to represent the emotional expression in the
intelligent social agent.

48. A method for implementing an intelligent social agent, the method
comprising:

receiving an input associated with a user;

accessing a user profile associated with the user;

extracting context information from the received input; and

processing the context information and the user profile to produce an adaptive
output to be represented by the intelligent social agent.

49. The method of claim 48 wherein the input associated with the user
comprises physiological data associated with the user.

50. The method of claim 48 wherein the input associated with the user
comprises application program information associated with the user.

51. The method of claim 48 wherein extracting context information comprises
extracting information about an affective state of the user.

52. The method of claim 51 wherein extracting information about an affective
state of the user is based on physiological information associated with the user.

53. The method of claim 51 wherein extracting information about an affective
state of the user is based on vocal analysis information associated with the user.

54. The method of claim 51 wherein extracting information about an affective
state of the user is based on verbal information from the user.

55. The method of claim 48 wherein extracting context information comprises
extracting a geographical position of the user.

56. The method of claim 55 wherein extracting context information comprises
extracting information based on the geographical position of the user.

57. The method of claim 48 wherein extracting context information comprises
extracting information about the application content associated with the user.

58. The method of claim 48 wherein extracting context information comprises
extracting information about a linguistic style of the user.

59. The method of claim 48 wherein the adaptive output comprises a verbal
expression to be represented by the intelligent social agent.

60. The method of claim 48 wherein the adaptive output comprises a facial
expression to be represented by the intelligent social agent.

61. The method of claim 48 wherein an adaptive output comprises an
emotional expression to be represented by the intelligent social agent.



WO 03/073417 PCT/US03/06218

[Drawing sheet 1/11: figure content not recoverable from the text.]

[Drawing sheet 2/11: FIG. 2. Figure content not recoverable from the text.]



[FIG. 3: block diagram 300. Labels recoverable from the drawing: APPLICATION; INFORMATION EXTRACTOR 320, containing VERBAL EXTRACTOR, NON-VERBAL EXTRACTOR, USER CONTEXT EXTRACTOR, and APPLICATION CONTEXT EXTRACTOR; ADAPTATION ENGINE 330, containing MACHINE LEARNING, AGENT PERSONALIZATION, and DYNAMIC ADAPTATOR; VERBAL GENERATOR; AFFECT GENERATOR; INTELLIGENT SOCIAL AGENT. Other legible reference numerals: 322, 324, 326, 332, 334, 350, 360.]



[FIG. 4A: flowchart of process 400A. Steps, in the order shown:
RECEIVE PHYSIOLOGICAL DATA ABOUT THE USER;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON PHYSIOLOGICAL DATA;
RECEIVE VOCAL ANALYSIS DATA;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON VOCAL ANALYSIS;
RECEIVE VERBAL CONTENT;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON VERBAL CONTENT;
INTEGRATE AFFECTIVE STATE HYPOTHESES AND DETERMINE AFFECTIVE STATE OF USER.
Legible reference numerals: 415A, 420A.]
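
The final step labeled in FIG. 4A, integrating the per-channel hypotheses into a single affective state, can be sketched as follows. Weighted voting over (state, confidence) pairs is an assumption made purely for illustration; the figure states only that the hypotheses are integrated. The function name integrate_hypotheses, the channel names, and the confidence values are likewise hypothetical.

```python
from collections import defaultdict


def integrate_hypotheses(hypotheses, weights=None):
    """Combine per-channel affective-state hypotheses into one state.

    `hypotheses` maps a channel name to a (state, confidence) pair.
    Weighted voting is an assumption of this sketch, not a method
    taken from the figure.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for channel, (state, confidence) in hypotheses.items():
        # Channels without an explicit weight count with weight 1.0.
        scores[state] += weights.get(channel, 1.0) * confidence
    # The state with the highest combined score is selected.
    return max(scores, key=scores.get)


channel_hypotheses = {
    "physiological": ("stressed", 0.6),
    "vocal": ("stressed", 0.5),
    "verbal": ("calm", 0.7),
}
state = integrate_hypotheses(channel_hypotheses)
print(state)  # stressed
```

Passing a weights mapping lets one channel dominate, for example when one input source is considered more reliable than the others.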



[FIG. 4B: flowchart of process 400B. Steps, in the order shown:
RECEIVE PHYSIOLOGICAL DATA ABOUT THE USER;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON PHYSIOLOGICAL DATA;
RECEIVE VOCAL ANALYSIS DATA;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON VOCAL ANALYSIS;
RECEIVE VERBAL CONTENT;
DETERMINE HYPOTHESIS FOR AFFECTIVE STATE OF USER BASED ON VERBAL CONTENT;
INTEGRATE AFFECTIVE STATE HYPOTHESES AND DETERMINE AFFECTIVE STATE OF USER.
Legible reference numerals: 415B, 420B, 440B.]



[Drawing sheet 6/11, FIG. 5: flowchart of process 500. Steps, in the order shown:
RECEIVE CONTENT AND CONTEXT INFORMATION;
ACCESS BASIC USER PROFILE;
ADJUST CONTENT AND CONTEXT INFORMATION BASED ON USER PROFILE INFORMATION;
PERFORM ESSENTIAL ACTIONS IN APPLICATION PROGRAM;
DETERMINE APPROPRIATE VERBAL EXPRESSION FOR INTELLIGENT SOCIAL AGENT;
DETERMINE APPROPRIATE EMOTIONAL EXPRESSION FOR INTELLIGENT SOCIAL AGENT;
GENERATE APPROPRIATE VERBAL EXPRESSION FOR INTELLIGENT SOCIAL AGENT;
GENERATE APPROPRIATE AFFECT FOR FACIAL EXPRESSION OF INTELLIGENT SOCIAL AGENT;
GENERATE APPROPRIATE AFFECT FOR VOCAL EXPRESSION OF INTELLIGENT SOCIAL AGENT.
Legible reference numerals: 510, 530, 535.]
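
The sequence of steps labeled in FIG. 5 can be sketched as a short pipeline: adjust the context using the user profile, determine the verbal and emotional expression, then generate the output. Every rule in this fragment (the prefers_brief profile key, the task_failed event, the expression names) is a hypothetical placeholder chosen for illustration, and the step of performing actions in the application program is omitted.

```python
def adapt(context, profile):
    """Adjust extracted context using the user profile (hypothetical rule)."""
    adjusted = dict(context)
    # Assumed rule: a user who prefers brief output gets a terse style.
    adjusted["style"] = "terse" if profile.get("prefers_brief") else "full"
    return adjusted


def determine_expressions(adjusted):
    """Decide the verbal and emotional expression for the agent."""
    if adjusted.get("event") == "task_failed":
        verbal, emotion = "apologize and explain", "concerned"
    else:
        verbal, emotion = "confirm", "pleased"
    if adjusted["style"] == "terse":
        # Keep only the first part of a compound verbal expression.
        verbal = verbal.split(" and ")[0]
    return verbal, emotion


def generate_output(verbal, emotion):
    """Produce verbal content plus facial and vocal affect settings."""
    return {"verbal": verbal, "face": emotion, "voice": emotion}


profile = {"prefers_brief": True}
context = {"event": "task_failed"}
output = generate_output(*determine_expressions(adapt(context, profile)))
print(output)
```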



[FIG. 6: flowchart of process 600. Steps, in the order shown:
ACCESS BASIC USER PROFILE;
RECEIVE ROLE OF INTELLIGENT SOCIAL AGENT FOR APPLICATION PROGRAM;
APPLY APPEAL RULE TO SELECT VISUAL APPEARANCE FOR INTELLIGENT SOCIAL AGENT;
APPLY APPROPRIATENESS RULE TO MODIFY VISUAL APPEARANCE FOR INTELLIGENT SOCIAL AGENT;
PRESENT VISUAL APPEARANCE FOR INTELLIGENT SOCIAL AGENT;
APPLY APPEAL RULE TO SELECT VOICE FOR INTELLIGENT SOCIAL AGENT;
APPLY APPROPRIATENESS RULE TO MODIFY VOICE FOR INTELLIGENT SOCIAL AGENT;
PRESENT VOICE FOR INTELLIGENT SOCIAL AGENT;
ASSOCIATE INTELLIGENT SOCIAL AGENT WITH A USER.
Legible reference numeral: 620.]
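
The two-stage selection labeled in FIG. 6, an appeal rule that selects an appearance followed by an appropriateness rule that modifies it for the agent's role, can be sketched as below. The concrete rules here (choosing the candidate that shares the most traits with the user, and setting attire by role) are assumptions made for this sketch and are not taken from the figure; all names and values are hypothetical.

```python
def apply_appeal_rule(profile, candidates):
    """Select the candidate appearance sharing the most traits with the user.

    Similarity-based selection is an assumption of this sketch.
    On a tie, the earliest candidate in the list is kept.
    """
    user_traits = set(profile.get("traits", []))
    return max(candidates, key=lambda c: len(user_traits & set(c["traits"])))


def apply_appropriateness_rule(appearance, role):
    """Return a copy of the appearance modified to suit the agent's role."""
    modified = dict(appearance)
    modified["attire"] = "formal" if role == "business_assistant" else "casual"
    return modified


profile = {"traits": ["young", "energetic"]}
candidates = [
    {"name": "A", "traits": ["older", "calm"]},
    {"name": "B", "traits": ["young", "energetic"]},
]
selected = apply_appeal_rule(profile, candidates)
presented = apply_appropriateness_rule(selected, "business_assistant")
print(presented["name"], presented["attire"])
```

The same pair of functions could be reused for voice selection by passing voice candidates instead of visual appearances.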



[Drawing sheet 8/11: FIG. 7. Figure content not recoverable from the text.]

[Drawing sheet 9/11: FIG. 8. Figure content not recoverable from the text.]



[Drawing sheet 10/11, FIG. 9. Labels recoverable from the drawing: COMPUTING DEVICE OPERATION APPLICATION PROGRAM; BATTERY STATUS; OPEN APPLICATION PROGRAM; CLOSE APPLICATION PROGRAM; SYNCHRONIZE DATA.]



[Drawing sheet 11/11: FIG. 10. Figure content not recoverable from the text.]



